Reinforcement-Learning-Based Adaptive Optimal Flight Control with Output Feedback and Input Constraints
Authors
Abstract
Engineering Note: Reinforcement-Learning-Based Adaptive Optimal Flight Control with Output Feedback and Input Constraints
Bo Sun (https://orcid.org/0000-0002-5229-8545), Ph.D. Student, Control and Operations Department, Faculty of Aerospace Engineering, Delft University of Technology, 2629 HS Delft, South Holland, The Netherlands
Erik-Jan van Kampen (https://orcid.org/0000-0002-5593-4471), Assistant Professor, Delft University of Technology, Delft, The Netherlands; Member AIAA
Received: 10 October 2020
Accepted: 1 May 2021
Published online: 8 June 2021
Issue: Volume 44, Number 9, September 2021
DOI: https://doi.org/10.2514/1.G005715
Keywords: Full State Feedback; Reinforcement Learning; Flight Control; Elevator Deflection; Tracking Control; ANN; Bang Bang Control; Nonlinear Systems; Aerospace System; Monte Carlo Simulation
Acknowledgment: The authors would like to thank the Chinese Scholarship Council for financial support (project reference number 201806290007).
Copyright © American Institute of Aeronautics and Astronautics, Inc. All rights reserved.
Similar resources
Robust Control of Encoderless Synchronous Reluctance Motor Drives Based on Adaptive Backstepping and Input-Output Feedback Linearization Techniques
In this paper, the design and implementation of an adaptive speed controller for a sensorless synchronous reluctance motor (SynRM) drive system are proposed. A combination of the well-known adaptive input-output feedback linearization (AIOFL) and adaptive backstepping (ABS) techniques is used for speed tracking control of the SynRM. The AIOFL controller is capable of estimating motor two-axis inductances ...
Output Feedback Control of Parabolic PDE Systems with Input Constraints
This paper proposes a methodology for output feedback control of parabolic PDE systems with input constraints. Initially, Galerkin’s method is used for the derivation of a finite-dimensional ODE system that captures the dominant dynamics of the PDE system. This ODE system is then used as the basis for the synthesis, via Lyapunov techniques, of stabilizing bounded output feedback control laws th...
Adaptive Input-Output Linearization Control of pH Processes
pH control is a challenging problem due to its highly nonlinear nature. In this paper, the performances of two different adaptive global linearizing controllers (GLC) are compared. A least-squares technique is used to identify the titration curve. The first controller is a standard GLC based on material balances of each species. For implementation of this controller a nonlinear state...
Output feedback control with input saturations:
This paper addresses the control of linear systems with input saturations. We seek a controller that guarantees for the closed-loop system: (i) stability for a given polytope of initial conditions, (ii) a prescribed weak L2 gain attenuation between inputs and outputs of interest. Two approaches are proposed based on: (i) ensuring that the controller never saturates: the obtained controller is ...
Journal
Journal title: Journal of Guidance, Control, and Dynamics
Year: 2021
ISSN: 1533-3884, 0731-5090
DOI: https://doi.org/10.2514/1.g005715